Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

postgresql: support JIT #150801

Closed
wants to merge 2 commits into from
Closed

postgresql: support JIT #150801

wants to merge 2 commits into from

Conversation

abbradar
Copy link
Member

Motivation for this change

JIT compilation of PostgreSQL queries. Disabled by default for now.

Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.05 Release Notes (or backporting 21.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

Tested by EXPLAIN ANALYZEing queries with JIT force-enabled.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 0 This PR does not cause any packages to rebuild labels Dec 15, 2021
@abbradar
Copy link
Member Author

Given that it's optional I'll merge that tomorrow if no questions arise.

@marsam
Copy link
Contributor

marsam commented Dec 16, 2021

Please see the discussion in previous attempt #124804.
You also need to include a test to verify that JIT works

This both makes the code cleaner and allows for expressions like:

`(postgresql.override { jitSupport = true; }).pkgs.postgis`

The change also allows users to configure PostgreSQL flags in NixOS
module more conveniently:

```
{
  services.postgresql.package = pkgs.postgresql.override { jitSupport = true; };
  services.postgresql.extraPlugins = with config.services.postgresql.package.pkgs; [ postgis pg_repack ];
}
```

Before they were required to override `postgresql` in an overlay or
pass `this` explicitly.
Disable it by default because of closure size, and to avoid surprise
negative performance changes in production (were observed with very
large and complex queries due to compilation time).

See also dcd9455 for a great
explanation on how JIT interacts with packaging and closure size.
@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Dec 18, 2021
@abbradar
Copy link
Member Author

@marsam Thanks for pointing the prior work up, it was very helpful!

I added an assertion to ensure that jitSupport is not used on older PostgreSQL versions and cherry-picked @thoughtpolice's patch to move bitcode to $out. It turned out (as shown in my patch before) it's not necessary to use clangStdenv -- PostgreSQL can be compiled with GCC while using clang only for bitcode generation. This way we also don't have new stdenv.cc references, so 8caf523 is not necessary.

I also changed recursion scheme for PostgreSQL to a more widely used let this = ...; in this. This allows one to override postgresql flags in a friendlier way than it was before, when the derivation effectively looked itself up in the package set.

I'm not sure about tests, because adding them would mean also adding postgresqlJit to our package set (we don't build it with JIT by default right now). I did test before that JIT works in exactly the same way @andir's tests do. What do you think @marsam?

@ofborg ofborg bot added 10.rebuild-linux: 1-10 and removed 10.rebuild-linux: 0 This PR does not cause any packages to rebuild labels Dec 18, 2021
@abbradar
Copy link
Member Author

abbradar commented Dec 18, 2021

Actually I re-tested it right now, and it turns out 91841ff breaks JIT (so I removed it from this PR).

I also considered #124804 (comment), but I'm honestly afraid to do so much (possibly fragile) patching for a feature that's not even enabled by default.

@abbradar
Copy link
Member Author

abbradar commented Dec 18, 2021

I accidentally noticed that Python uses the same way of getting a self reference as PostgreSQL here: instead of let this = stdenv.mkDerivation {...}; in this it receives a this argument, which is passed from the surrounding code. Because of that this is not actually a reference to the same package, but to the package that is defined in an attrset somewhere. Usually it's the same, but that is not so when arguments are locally overridden, or perhaps for weird overlays (both points don't apply to Python as much).

I can't grasp any merits of this implementation; let leads to no clunky postgresql = ... { self = postgresql; } boilerplate, and current scheme leads to unexpected results when evaluating, say, (postgresql.override { jitSupport = true; }).pkgs.postgis. However I feel it must be something there that I don't see -- what is it?

@abbradar
Copy link
Member Author

@marsam I propose to go forward for now and change the recursion implementation -- this will allow (postgresql.override { jitSupport = true; }).pkgs.postgis, among other things.

Do you have an opinion on whether/how to implement tests?

@abbradar
Copy link
Member Author

@marsam Kind ping -- do you still consider adding postgresqlJit to our package set and testing a necessity?

@abbradar
Copy link
Member Author

@marsam I'm sorry for the noise, but can you take a quick look? These changes feel low-risk to me given default PostgreSQL build left as is, so I'd just go forward but it feels uncomfortable given you raised the point about testing. If you feel we should build JIT version on Hydra I can add a NixOS test -- my concern is about building a package which most users won't need.

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 21, 2022
@risicle risicle mentioned this pull request Mar 4, 2023
12 tasks
@RaitoBezarius
Copy link
Member

@marsam I'm sorry for the noise, but can you take a quick look? These changes feel low-risk to me given default PostgreSQL build left as is, so I'd just go forward but it feels uncomfortable given you raised the point about testing. If you feel we should build JIT version on Hydra I can add a NixOS test -- my concern is about building a package which most users won't need.

Apologies for the delay, can I help you to get this merged?

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Mar 5, 2023
@Ma27
Copy link
Member

Ma27 commented Mar 14, 2023

@RaitoBezarius First of all, I'm also interested in this (in fact I'm using something similar as overlay for my personal infrastructure). Since it doesn't seem that @abbradar is that active these days I'd suggest the following approach:

I think the increase in closure size is OK since it's just an opt-in feature. However @dasJ do you still use the code from #124804 (comment)? Perhaps we could use that in here as well to get a bit of closure size reduction.

Do you want to take care of that? Otherwise I can try to do that soonish (though I'm kinda involved with exams these days).

@dasJ
Copy link
Member

dasJ commented Mar 14, 2023

@Ma27 current code:

    package = mkOption {
      type = package;
      description = "PostgreSQL package to use";
      default = pkgs.postgresql_13;
      defaultText = "pkgs.postgresql_13";
      # Patch in JIT support if needed
      # See https://github.com/NixOS/nixpkgs/pull/124804#issuecomment-864616931
      apply = x: if cfg.enableJit then (x.overrideAttrs (oA: {
        nativeBuildInputs = (oA.nativeBuildInputs or []) ++ [ pkgs.llvmPackages_latest.llvm pkgs.nukeReferences pkgs.patchelf ];
        configureFlags = (oA.configureFlags or []) ++ [ "--with-llvm" ];
        postPatch = ''
          ${oA.postPatch or ""}

          # Force lookup of jit stuff in $out instead of $lib
          substituteInPlace src/backend/jit/jit.c --replace pkglib_path \"$out/lib\"
          substituteInPlace src/backend/jit/llvm/llvmjit.c --replace pkglib_path \"$out/lib\"
          substituteInPlace src/backend/jit/llvm/llvmjit_inline.cpp --replace pkglib_path \"$out/lib\"
        '';

        # upstream enabled pie, which breaks the build: https://github.com/NixOS/nixpkgs/pull/124188
        hardeningEnable = [];

        postInstall = ''
          ${oA.postInstall or ""}

          # Move the bitcode and libllvmjit.so library out of $lib; otherwise, every client that
          # depends on libpq.so will also have libLLVM.so in its closure too, bloating it
          moveToOutput "lib/bitcode" "$out"
          moveToOutput "lib/llvmjit*" "$out"

          # In the case of JIT support, prevent a retained dependency on clang-wrapper
          substituteInPlace "$out/lib/pgxs/src/Makefile.global" --replace ${pkgs.llvmPackages_latest.stdenv.cc}/bin/clang clang
          nuke-refs $out/lib/llvmjit_types.bc $(find $out/lib/bitcode -type f)

          # Stop out depending on the default output of llvm
          substituteInPlace $out/lib/pgxs/src/Makefile.global \
            --replace ${pkgs.llvmPackages_latest.llvm.out}/bin "" \
            --replace '$(LLVM_BINPATH)/' ""

          # Stop out depending on the -dev output of llvm
          substituteInPlace $out/lib/pgxs/src/Makefile.global \
            --replace ${pkgs.llvmPackages_latest.llvm.dev}/bin/llvm-config llvm-config \
            --replace -I${pkgs.llvmPackages_latest.llvm.dev}/include ""

          # Stop lib depending on the -dev output of llvm
          rpath=$(patchelf --print-rpath $out/lib/llvmjit.so)
          nuke-refs -e $out $out/lib/llvmjit.so
          # Restore the correct rpath
          patchelf $out/lib/llvmjit.so --set-rpath "$rpath"
        '';
      })).override {
        stdenv = pkgs.llvmPackages_latest.stdenv;
      } else x;
    };

and yes, that code is running in production ;)

@RaitoBezarius
Copy link
Member

@RaitoBezarius First of all, I'm also interested in this (in fact I'm using something similar as overlay for my personal infrastructure). Since it doesn't seem that @abbradar is that active these days I'd suggest the following approach:

Awesome!

Got it.

I think the increase in closure size is OK since it's just an opt-in feature. However @dasJ do you still use the code from #124804 (comment)? Perhaps we could use that in here as well to get a bit of closure size reduction.

Do you want to take care of that? Otherwise I can try to do that soonish (though I'm kinda involved with exams these days).

I cannot take care of anything yet but I am open to reviewing :).

@Ma27
Copy link
Member

Ma27 commented Mar 15, 2023

OK, so first of all I'll test the changes of @dasJ (and ensure that I understand all of it, not sure about the patchelf --set-rpath right now, but that's probably just me). If all of that works out, I'll try to integrate this into this PR and make it reviewable/mergeable.

@Ma27
Copy link
Member

Ma27 commented Mar 16, 2023

Have a more or less working draft locally, need to check if all extensions are still fine and write something for the release notes. Would file a new PR at some point the next days :)

Ma27 added a commit to Ma27/nixpkgs that referenced this pull request Mar 29, 2023
Closes NixOS#150801

Note: I decided against resuming directly on NixOS#150801 because the
conflict was too big (and resolving it seemed too error-prone to me).
Also the `this`-refactoring could be done in an easier manner, i.e. by
exposing JIT attributes with the correct configuration. More on that
below.

This patch creates variants of the `postgresql*`-packages with JIT[1]
support. Please note that a lot of the work was derived from previous
patches filed by other contributors, namely dasJ, andir and abbradar,
hence the co-authored-by tags below.

Effectively, the following things have changed:

* For JIT variants an LLVM-backed stdenv with clang is now used as
  suggested by dasJ[2]. We need LLVM and CLang[3] anyways to build the
  JIT-part, so no need to mix this up with GCC's stdenv. Also, using the
  `dev`-output of LLVM and clang's stdenv for building (and adding llvm
  libs as build-inputs) seems more cross friendly to me (which will
  become useful when cross-building for JIT-variants will actually be
  supported).

* Plugins inherit the build flags from the Makefiles in
  `$out/lib/pgxs/src` (e.g. `-Werror=unguarded-availability-new`). Since
  some of the flags are clang-specific (and stem from the use of the
  CLang stdenv) and don't work on gcc, the stdenv of `pkgs.postgresql`
  is passed to the plugins. I.e., plugins for non-JIT variants are built
  with a gcc stdenv on Linux and plugins for JIT variants with a clang
  stdenv.

  Since `plv8` hard-codes `gcc` as `$CC` in its Makefile[4], I marked it
  as broken for JIT-variants of postgresql only.

* Added a test-matrix to confirm that JIT works fine on each
  `pkgs.postgresql_*_jit` (thanks Andi for the original test in
  NixOS#124804!).

* For each postgresql version, a new attribute
  `postgresql_<version>_jit` (and a corresponding
  `postgresqlPackages<version>JitPackages`) are now exposed for better
  discoverability and prebuilt artifacts in the binary cache.

* In NixOS#150801 the `this`-argument was replaced by an internal recursion.
  I decided against this approach because it'd blow up the diff even
  more which makes the readability way harder and also harder to revert
  this if necessary.

  Instead, it is made sure that `this` always points to the correct
  variant of `postgresql` and re-using that in an additional
  `.override {}`-expression is trivial because the JIT-variant is
  exposed in `all-packages.nix`.

* I think the changes are sufficiently big to actually add myself as
  maintainer here.

* Added `libxcrypt` to `buildInputs` for versions <v13. While
  building things with an LLVM stdenv, these versions complained that
  the extern `crypt()` symbol can't be found. Not sure what this is
  exactly about, but since we want to switch to libxcrypt for `crypt()`
  usage anyways[5] I decided to add it. For >=13 it's not relevant
  anymore anyways[6].

* JIT support doesn't work with cross-compilation. It is attempted to
  build LLVM-bytecode (`%.bc` is the corresponding `make(1)`-rule) for
  each sub-directory in `backend/` for the JIT apparently, but with a
  $(CLANG) that can produce binaries for the build, not the host-platform.

  I managed to get a cross-build with JIT support working with
  `depsBuildBuild = [ llvmPackages.clang ] ++ buildInputs`, but
  considering that the resulting LLVM IR isn't platform-independent this
  doesn't give you much. In fact, I tried to test the result in a VM-test,
  but as soon as JIT was used to optimize a query, postgres would
  coredump with `Illegal instruction`.

A common concern of the original approach - with llvm as build input -
was the massive increase of closure size. With the new approach of using
the LLVM stdenv directly and patching out references to the clang drv in
`$out` the effective closure size changes are:

    $ nix path-info -Sh $(nix-build -A postgresql_14)
    /nix/store/kssxxqycwa3c7kmwmykwxqvspxxa6r1w-postgresql-14.7	306.4M
    $ nix path-info -Sh $(nix-build -A postgresql_14_jit)
    /nix/store/xc7qmgqrn4h5yr4vmdwy56gs4bmja9ym-postgresql-14.7	689.2M

Most of the increase in closure-size stems from the `lib`-output of
LLVM

    $ nix path-info -Sh /nix/store/5r97sbs5j6mw7qnbg8nhnq1gad9973ap-llvm-11.1.0-lib
    /nix/store/5r97sbs5j6mw7qnbg8nhnq1gad9973ap-llvm-11.1.0-lib	349.8M

which is why this shouldn't be enabled by default.

While this is quite much because of LLVM, it's still a massive
improvement over the simple approach of adding llvm/clang as
build-inputs and building with `--with-llvm`:

    $ nix path-info -Sh $(nix-build -E '
	with import ./. {};
	postgresql.overrideAttrs ({ configureFlags ? [], buildInputs ? [], ... }: {
	  configureFlags = configureFlags ++ [ "--with-llvm" ];
	  buildInputs = buildInputs ++ [ llvm clang ];
	})' -j0)
    /nix/store/i3bd2r21c6c3428xb4gavjnplfqxn27p-postgresql-14.7	  1.6G

Co-authored-by: Andreas Rammhold <[email protected]>
Co-authored-by: Janne Heß <[email protected]>
Co-authored-by: Nikolay Amiantov <[email protected]>

[1] https://www.postgresql.org/docs/current/jit-reason.html
[2] NixOS#124804 (comment)
    & NixOS#150801 (comment)
[3] This fails with the following error otherwise:
    ```
    configure: error: clang not found, but required when compiling --with-llvm, specify with CLANG=
    ```
[4] https://github.com/plv8/plv8/blob/v3.1.5/Makefile#L14
[5] NixOS#181764
[6] postgres/postgres@c45643d
wolfgangwalther added a commit to wolfgangwalther/nixpkgs that referenced this pull request Mar 15, 2024
This was proposed by abbradar in NixOS#150801, but left out of the follow up PR
NixOS#221851 by Ma27 to reduce the size of the diff. Compared to the initial
proposal this includes the callPackage call in the recursion, which avoids
breaking the withJIT/withoutJIT helpers.

In terms of nixpkgs, this is a pure refactor, no derivations change. However,
this makes downstream expressions like the following possible:

  (postgresql.override { jitSupport = true; }).pkgs.postgis

This would have not worked before without passing another "this" argument,
which is error prone as can be seen in this example:

  https://github.com/PostgREST/postgrest/pull/3222/files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants